REMARKS 

Applicants request favorable reconsideration and allowance of the subject 
application in view of the preceding amendments and the following remarks. 

Claims 1, 3-5, 8, 12, 13 and 15-27 are pending in this application, with 
Claims 1, 19, 22, 25 and 27 being independent. By this Amendment, Applicants have 
cancelled Claims 2, 6, 7, 9-11 and 14, and amended Claims 1, 3-5, 8, 12, 13 and 15-27. 

Applicants gratefully acknowledge the Examiner's indication that the 
application contains allowable subject matter, and that Claims 19 and 25 would be 
allowable if rewritten in independent form. Applicants have rewritten those claims in 
independent form and request allowance thereof. 

The drawings stand objected to for formal matters. Attached hereto are 
replacement drawing sheets for Figures 4-6, as required by the Examiner. In the 
replacement drawing sheets, "Target that user pays attention" has been changed to --Target 
that user pays attention to--. No new matter has been added. 

Claims 2, 4, 11, 13-16 and 19-21 stand objected to on formal grounds. 
Applicants have amended the claims to attend to the various matters giving rise to the 
objection. 

The specification stands rejected under 35 U.S.C. § 112, first paragraph. 
Attached hereto is a substitute specification which corrects informalities in the 
specification, as required by the Examiner. A marked-up version of the substitute 
specification, showing the changes made thereto, is also attached. No new matter has been 
added. 

Claims 1-29 stand rejected under 35 U.S.C. § 101. Applicants have 
amended Claim 29 as suggested in the Office Action. With respect to Claims 1-21, it is the 
position in the Office Action that the claims could read on a device that has no physical 
embodiment. Applicants respectfully disagree inasmuch as each of those claims is directed 
to "an apparatus". Even if the apparatus may operate by means of software, the software 
would inherently be executed by hardware of some kind, in order for the term "apparatus" 
to have any meaning. Accordingly, Applicants traverse this basis for the rejection. With 
respect to Claims 22-28, it is the position set forth in the Office Action that method claims 
are non-statutory subject matter. The Office Action cites MPEP § 2106, IV-A, as 
supporting that position. Applicants do not understand that section of the MPEP to provide any 
support for the position that method claims are non-statutory subject matter. Furthermore, 
Applicants submit that method claims are regularly allowed by the U.S. Patent and 
Trademark Office and are statutorily acceptable. Accordingly, Applicants also traverse this 
ground of the rejection. 

Thus, Applicants request withdrawal of the rejection under 35 U.S.C. § 101. 

Claims 2, 4-6, 9 and 24-26 stand rejected under 35 U.S.C. § 112, second 
paragraph, as being indefinite. Applicants have amended the claims to attend to the 
matters noted in the Office Action as giving rise to the rejection of Claims 2, 6 and 24-26. 
With respect to Claims 4, 5 and 9, Applicants submit that "discrimination information" 
would be readily understood by one of ordinary skill in the art, and thus is not indefinite. 

Accordingly, Applicants request withdrawal of the rejection under § 112, 
second paragraph. 

Claims 1, 2 and 9 stand rejected under 35 U.S.C. § 102(b) as anticipated by 
"First Person Indoor/Outdoor Augmented Reality Application: ARQuake" (Thomas et al.). 
Claims 1-3 and 7-10 stand rejected under 35 U.S.C. § 103(a) as unpatentable over Thomas 
et al. in view of "Integrating Virtual and Augmented Realities in an Outdoor Application" 
(Piekarski et al.). Claims 4 and 5 stand rejected under 35 U.S.C. § 103(a) as unpatentable 
over Thomas et al. in view of "Tinmith-Metro: New Outdoor Techniques for Creating City 
Models with an Augmented Reality Wearable Computer" (Piekarski et al. 2001). Claim 6 
stands rejected under 35 U.S.C. § 103(a) as unpatentable over Piekarski et al. in view of 
U.S. Patent Publication 2002/0164066 (Matsumoto). Claims 11, 12, 15-18, 22-23 and 27 
stand rejected under 35 U.S.C. § 103(a) as unpatentable over Thomas et al. in view of 
Piekarski et al. and "First Person Indoor/Outdoor Augmented Reality Application: 
ARQuake" (Thomas et al. 2002). Claims 20, 21, 24 and 26 stand rejected under 35 U.S.C. 
§ 103(a) as unpatentable over Thomas et al. in view of Piekarski et al., Sierra™ - Starsiege: 
Tribes® Game (Tribes) and Thomas et al. 2002. 

As recited in independent Claim 1, Applicants' invention is directed to an 
information presentation apparatus in which a viewpoint position and orientation 
measurement unit measures a position and orientation of a user's viewpoint, and in which 
an input unit inputs viewpoint position and orientation information of an other user. An 
annotation image generation unit generates an annotation image from annotation data, 
based on position and orientation information of the user and the viewpoint position and 
orientation information of the other user. The image of the real world, the virtual world 
image and the annotation image are composited and displayed. 

Independent Claim 22 is directed to an information processing method. The 
method includes the steps of generating a virtual world image according to viewpoint 
information, by using previously held virtual world data; inputting viewpoint information 
of an other user; and generating an annotation concerning an attention target based on the 
viewpoint information of the user and the viewpoint information of the other user. An 
image is generated based on the image of the real world, a generated virtual world image 
and the generated annotation. 

Independent Claim 27 is directed to a computer-executable program for 
causing a computer to perform an information processing method. The method includes 
steps generally similar to those recited in independent Claim 22. 

With the configuration of the present invention, in a system in which mixed 
reality space is shared by plural users, it is possible to indicate to one user the status of 
viewpoint information of other users. 

Thomas et al. and Thomas et al. 2002 generally describe that a virtual image 
is synthesized onto a real image to form a mixed reality in gaming. Although the systems 
described in those documents discuss plural users, the annotation information concerning 
an other user is not generated and displayed, as in the present invention. 

Piekarski et al. is generally cited as describing a wireless connection in 
gaming. While the Office Action also cites this document as describing annotated 
information, Applicants submit that Piekarski et al. does not describe the use of viewpoint 
information of another user, as now recited in the independent claims. 

Piekarski et al. 2001 is cited in the Office Action as describing gaming 
environments utilizing information about polygons. Pham et al. is cited in the Office 
Action as teaching specific algorithms to be used in mixed reality systems. Tribes is cited 
in the Office Action as describing the use of colored arrows to point out teammates and 
enemies in gaming displays. Applicants submit that these documents fail to remedy the 
deficiencies discussed above with respect to Thomas et al., Thomas et al. 2002 and 
Piekarski et al. 

Accordingly, Applicants submit that the applied references, when taken 
alone or in combination, fail to disclose or suggest at least the features of an annotation 
image generation unit generating an annotation image from the annotation data, based on 
position and orientation information of a user and viewpoint position and orientation 
information of an other user, and a composite unit compositing the image of the real world, 
the virtual world image and the annotation image, as generally recited in independent 
Claim 1. Applicants also submit that these documents fail to disclose or suggest at least 
the features of inputting viewpoint information of a user; inputting viewpoint information 
of an other user; generating an annotation concerning an attention target based on the 
viewpoint information of the user and the viewpoint information of the other user; and 
generating an image obtained by synthesizing an image of a real world, a generated virtual 
world image and the generated annotation, as generally recited in independent Claims 22 
and 27. 

The remaining claims in this application are dependent claims which 
depend from the independent claims discussed above. Thus, Applicants submit that the 
dependent claims are allowable for at least the reasons discussed above with respect to the 
independent claims. Those dependent claims also recite additional features further 
distinguishing them from the cited references. Applicants request favorable and 
independent consideration thereof. 

For the foregoing reasons, Applicants request withdrawal of the rejections 
under 35 U.S.C. §§ 102 and 103. 

Applicants' undersigned attorney may be reached in our Washington, D.C. 
office by telephone at (202) 530-1010. All correspondence should continue to be directed 
to our below-listed address. 



Respectfully submitted, 




Justin J. Oliver 
Attorney for Applicants 
Registration No. 44,986 



FITZPATRICK, CELLA, HARPER & SCINTO 
30 Rockefeller Plaza 
New York, New York 10112-3801 
Facsimile: (212) 218-2200 

Substitute Specification (marked-up version) 
Appln. No. 10/626,590 
Atty. Docket No. 03500.017440 

INFORMATION PRESENTATION APPARATUS AND INFORMATION 
PROCESSING METHOD THEREOF 

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates to an information presentation apparatus 
which presents an image obtained by synthesizing (or composing) a real world and 
a virtual world, and an image processing method of the information presentation 
apparatus. 

Related Background Art 

Recently, proposals have been made for an apparatus [[to]] which would use and 
apply a mixed reality (MR) technique of naturally combining a real world and a 
virtual world with each other without uncomfortableness [[is applied is actively 
proposed]]. Among them[[,]] is an apparatus which superimposes [[superposes]] 
virtual information on the real world and/or the virtual world observed by a user 
wearing a head mounted display (HMD), and presents the obtained information to 
the user [[is proposed]], whereby it is expected in this apparatus to improve 
working properties concerning engineering work, maintenance and the like. For 
example, a method of supporting a surveying work by superposing virtual flags on 
the image of the real world and then displaying the obtained image on the user's 
HMD is proposed. However, many of these apparatuses are premised on [[use of]] 
being used by only a single [[one]] user, whereby it is difficult to say that these 
apparatuses are suitable for [[the working in which]] uses such as conferences, 
lectures, cooperation or the like that [[shares]] involves sharing a single mixed 
reality (MR) space with plural persons [[is necessary]]. 

In other words, in [[the]] a case where [[the plural]] two or more persons 
[[perform the]] are involved in a conference, the lecture or the like by sharing 
the single MR space, it is necessary for these persons to observe the same target 
and thus share the information concerning the target in question. 

SUMMARY OF THE INVENTION 

An object of the present invention is to be able to provide predetermined 
information to users, by superposing an annotation on an image obtained by 
synthesizing a real world and a virtual world. 

For example, the present invention aims to provide a means for transmitting a 
target that one user wishes to cause another user to pay attention, a means for 
knowing position and direction of a target that users should pay attention, or a 
means for knowing whether or not a target [[that]] to which one user is paying 
attention at present is observed by another user. 
In order to achieve the above object, one aspect of the present invention is 
characterized by an information presentation apparatus comprising: 

a user operation input unit, adapted to input an operation of a user; 

a user viewpoint position and pose measurement unit, adapted to measure 
a position and pose at a user's viewpoint; 

a model data storage unit, adapted to store virtual world model data, real 
world model data, and data necessary to generate a virtual world image; 

an annotation data storage unit, adapted to store data necessary to be 
added to a real world and a virtual world and then displayed; 

a virtual image generation unit, adapted to generate an image of the virtual 
world by using information in the user viewpoint position and pose measurement 
unit, the model data storage unit and the annotation data storage unit; 

a user viewpoint image input unit, adapted to capture an image of the real 
world viewed from the user's viewpoint; and 

an image display unit, adapted to display an image obtained by 
synthesizing the image generated by the virtual image generation unit and the 
image obtained by the user viewpoint image input unit, on an image display device 
of the user. 

Moreover, to achieve the above object, another aspect of the present 
invention is characterized by an information processing method comprising the 
steps of: 

inputting viewpoint information of a user; 

generating a virtual world image according to the viewpoint information, 
by using previously held virtual world data; 

generating an annotation concerning an attention target; and 

generating an image obtained by synthesizing an image of a real world, 
generated virtual world image and the generated annotation. 

Moreover, to achieve the above object, the present invention is 
characterized by a program to achieve an information processing method 
comprising the steps of: 

inputting viewpoint information of a user; 

generating a virtual world image according to the viewpoint information, 
by using previously held virtual world data; 

generating an annotation concerning an attention target; and 

generating an image obtained by synthesizing an image of a real world, 
generated virtual world image and the generated annotation. 

Other objects and features of the present invention will become apparent 
from the following description taken in conjunction with the accompanying 
drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a block diagram schematically showing the structure of an 
information presentation apparatus according to [[the]] one embodiment of the 
invention; 

Fig. 2 is a block diagram showing the structure in a case where plural 
information presentation apparatuses are mutually connected together through a 
transmission channel, according to the embodiment of Fig. 1; 

Fig. 3 is a flow chart for explaining a processing procedure in the 
information presentation apparatus; 

Fig. 4 is a diagram for explaining a means which [[informs]], in a case where 
a target [[that]] to which a watching user is pay[[s]]ing attention is outside a 
visual range of a watched user, informs the watched user of the position of the 
target, according to [[the]] that embodiment; 

Fig. 5 is a diagram for explaining a means which [[informs]], in a case where 
the target [[that]] to which the watching user is pay[[s]]ing attention is inside 
the visual range of the watched user, informs the watched user of the target and 
information concerning the target, according to [[the]] that embodiment; 

Fig. 6 is a diagram for explaining a means which presents to the watching 
user whether or not the target [[that]] to which the watching user is pay[[s]]ing 
attention is inside the visual range of each watched user, according to [[the]] 
that embodiment; and 

Fig. 7 is a diagram for explaining a means which presents to a user 
positions where other users exist, according to [[the]] that embodiment. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Hereinafter, the preferred embodiments of the present invention will be 
described with reference to the accompanying drawings. 
(One Embodiment) 

Fig. 1 is a block diagram schematically showing the [[entire]] overall 
structure [[to]] in which an information presentation apparatus and an information 
presentation method according to [[the]] this embodiment are applied. 

A user operation input unit 101 is an input device which consists of, e.g., 
push button switches, a mouse, a joystick and the like. When a user of an 
information presentation apparatus 100 operates or handles the user operation input 
unit 101, control information according to operation(s) contents by the user is 
transferred to a virtual image generation unit 105. 

A user viewpoint position and pose measurement unit 102 is a position 
and pose measurement device such as a magnetic sensor, an optical sensor or the 
like. The user viewpoint position and pose measurement unit 102 measures a 
position and pose at a user's viewpoint by six degrees of freedom and transfers a 
measured result to the virtual image generation unit 105. Since it is generally 
difficult to set the position and pose measurement device at the user's viewpoint, 
the user viewpoint position and pose measurement unit 102 has a function to 
calculate the position and pose at the user's viewpoint on the basis of the output 
result of the position and pose measurement device. For example, in a case where 
the position and pose measurement device is fixed to a user's head, a relation 
between the output of the position and pose measurement device and the position 
and pose at the user's viewpoint is always maintained constant, whereby the 
relation is expressed by a certain expression. Therefore, by obtaining the certain 
expression in advance, the position and pose at the user's viewpoint is calculated 
based on the output from the position and pose measurement device. [[Besides,]] In 
addition, it is possible to capture an image of a real world by using a user 
viewpoint image input unit 106, and thus perform [[an]] image processing on [[of]] 
the captured image to correct an error in the output result of the position po[[s]]se 
measurement device. In [[the]] this image processing, for example, positions of 
plural feature points of which the three-dimensional coordinates in a real space 
have been known are detected from the image, the detected positions are compared 
with the positions of feature points of the image calculated from the output result 
of the position and pose measurement device to calculate the error in the output 
result of the position po[[s]]se measurement device, and the output result of the 
position and pose measurement device is corrected so as to delete the calculated 
error. Moreover, it is possible to calculate the position and pose at the user's 
viewpoint only from the image processing. 
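
The constant relation between the sensor output and the viewpoint described above can be illustrated as follows. This is a minimal sketch only, assuming that poses are written as 4x4 homogeneous matrices and that the "certain expression" is a fixed sensor-to-viewpoint transform obtained by calibration in advance. The function names and the offset values are hypothetical and do not appear in the specification.

```python
# Sketch only: the 4x4 matrix representation and all names are assumptions.

def mat_mul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    """Build a 4x4 homogeneous matrix for a pure translation."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def viewpoint_pose(sensor_pose, sensor_to_eye):
    """Viewpoint pose in world coordinates: (world <- sensor) * (sensor <- eye)."""
    return mat_mul(sensor_pose, sensor_to_eye)

# Hypothetical calibration result: the eye sits 0.10 below and 0.08 in front
# of a sensor fixed to the user's head. It stays constant while the head moves.
SENSOR_TO_EYE = translation(0.0, -0.10, 0.08)

# Hypothetical sensor reading (position only, no rotation, for brevity).
sensor_pose = translation(1.0, 1.7, 2.0)
eye_pose = viewpoint_pose(sensor_pose, SENSOR_TO_EYE)
```

Because the offset is fixed, only the sensor reading changes from frame to frame; a rotation in the sensor pose is handled by the same matrix product.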

A model data storage unit 103 is an auxiliary storage device or medium 
such as a hard disk, a CD-ROM or the like. The model data storage unit 103 holds 
and stores virtual world model data necessary to draw a virtual world by computer 
graphics (CG), real world model data necessary [[to]] accurately to synthesize the 
real world and the virtual world, and data necessary to generate a virtual world 
image. Here, the virtual world model data includes three-dimensional coordinates 
of vertices of a polygon of a virtual CG object arranged on the virtual world, 
structure information of faces of the polygon, discrimination information of the CG 
object, color information of the CG object, texture information of the CG object, 
size information of the CG object, position and pose information indicating the 
arrangement of the CG object in the virtual world, and the like. The real world 
model data includes three-dimensional coordinates of vertices of a polygon of an 
object existing in the real world merged with the virtual world, structure 
information of faces of the polygon, discrimination information of the object, size 
information of the object, position and pose information indicating the arrangement 
of the object, and the like. The data necessary to generate the virtual world image 
includes size and angle of an image pickup element of an image pickup device of 
the user viewpoint image input unit 106, and internal parameters such as an angle 
of view of a lens, a lens distortion parameter and the like. The information stored 
in the model data storage unit 103 is transferred to the virtual image generation unit 
105. Here, the model data storage unit 103 is not limited to that one provided 
inside the information presentation apparatus 100, that is, the model data storage 
unit 103 may be provided outside the information presentation apparatus 100 so as 
to transfer the data to the virtual image generation unit 105 through a transmission 
channel 200. 

An annotation data storage unit 104 is an auxiliary storage device or 
medium such as a hard disk, a CD-ROM or the like. The annotation data storage 
unit 104 holds and stores annotation data which indicates additional information to 
be displayed on the real world and the virtual world. The annotation data includes 
position and pose information of the object in the real world and the virtual world, 
discrimination information of the object, and text, symbol and image information 
for indicating the object to a user. Here, the annotation data storage unit 104 is not 
limited to that one provided inside the information presentation apparatus 100, that 
is, the annotation data storage unit 104 may be provided outside the information 
presentation apparatus 100 so as to transfer the data to the virtual image generation 
unit 105 through the transmission channel 200. 
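
One possible in-memory layout for the annotation data listed above is sketched below. It is an illustration under stated assumptions, not the stored format of the embodiment: the class name, the field names and the example values are all hypothetical, and the pose is simplified to three angles.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class AnnotationRecord:
    """Hypothetical record for one annotated object."""
    object_id: str                          # discrimination information of the object
    position: Tuple[float, float, float]    # position of the object
    pose: Tuple[float, float, float]        # pose of the object (three angles assumed)
    text: str = ""                          # text information shown to the user
    symbol: Optional[str] = None            # symbol information, if any
    image_path: Optional[str] = None        # image information, if any
    in_virtual_world: bool = False          # False: real-world object

def build_store(records) -> Dict[str, AnnotationRecord]:
    """Index the records by their discrimination information."""
    return {record.object_id: record for record in records}

store = build_store([
    AnnotationRecord("building-1", (10.0, 0.0, 25.0), (0.0, 0.0, 0.0), text="Building"),
    AnnotationRecord("cg-flag-3", (2.0, 0.0, 5.0), (0.0, 90.0, 0.0),
                     symbol="arrow", in_virtual_world=True),
])
```

Indexing by the discrimination information lets a record received from another apparatus be matched to the local copy of the same object.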

The virtual image generation unit 105 is actualized by a CPU, a 
microprocessor unit (MPU) or the like mounted in a computer. On the basis of 
position and pose information indicating the position and pose at the user's 
viewpoint obtained from the user viewpoint position and pose measurement unit 
102, the virtual image generation unit 105 sets the user's viewpoint in the virtual 
world, draws the model data stored in the model data storage unit 103 by the CG 
from the set viewpoint, and thus generates the image of the virtual world viewed 
from the user's viewpoint. Moreover, as shown in Fig. 2, the virtual image 
generation unit 105 which has a function to transmit the data to the transmission 
channel 200 and receive the data from the transmission channel 200 is connected 
mutually to a virtual image generation unit 105 of another information presentation 
apparatus 100 through the transmission channel 200 so as to exchange necessary 
information between them. Thus, plural users use the respective information 
presentation apparatuses 100, whereby they can share the same (or identical) MR 
space. Fig. 2 is the block diagram showing the structure in the case where the 
plural information presentation apparatuses 100 mutually connected together 
through the transmission channel 200 are used by the plural users. In accordance 
with the position and pose at the user's viewpoint obtained from the user viewpoint 
position and pose measurement unit 102 and the position and pose of another user's 
viewpoint obtained through the transmission channel 200, the virtual image 
generation unit 105 generates an annotation to be presented to the user, on the basis 
of the annotation data stored in the annotation data storage unit 104. Then, the 
virtual image generation unit 105 superimposes [[superposes]] the generated 
annotation on the image of the virtual world, and further displays the obtained 
image. Here, the generated annotation is not limited to a two-dimensional 
annotation. That is, the virtual image generation unit 105 may generate a 
three-dimensional annotation and draw the generated annotation by the CG together 
with the virtual world model stored in the model data storage unit 103. Incidentally, 
the virtual image generation unit 105 has a function to operate the virtual world and 
control the generated annotation according to user's operation information 
transferred from the user operation input unit 101. 

The user viewpoint image input unit 106 which includes one or two image 
pickup devices such as a CCD camera or the like captures an image of the real 
world which greets the user's eyes and then transfers the captured image to an 
image display unit 107. Here, in a case where the image display unit 107 is 
equipped with an optical see-through HMD, the user can directly observe the real 
world, whereby the user viewpoint image input unit 106 is unnecessary in this case. 

The image display unit 107 includes an image display device such as the 
HMD, a display or the like. The image display unit 107 synthesizes the image of 
the real world greeting the user's eyes and captured by the user viewpoint image 
input unit 106 and the image of the virtual world generated by the virtual image 
generation unit 105 together and then displays the synthesized image right in front 
of the user's eyes. Here, in the case where the image display unit 107 is equipped 
with the optical see-through HMD, the image of the virtual world generated by the 
virtual image generation unit 105 is displayed right in front of the user's eyes. 
Here, it should be noted that the image display unit 107 also acts as an image 
drawing unit according to an operation. 

The transmission channel 200 is a medium which achieves a wired or 
wireless computer network. The plural information presentation apparatuses 100 
are connected to the transmission channel 200, whereby the data to be mutually 
exchanged among the information presentation apparatuses 100 flows in the 
transmission channel 200. 

Hereinafter, control of the embodiment in which the above structure is 
provided will be explained. Fig. 3 is a flow chart for explaining a process 
procedure in the information presentation apparatus according to the embodiment. 

In a step S000, the information presentation apparatus is activated, and a 
process necessary for initialization is performed. 

In a step S100, the user's operation [[to]] with the user operation input unit 
101 is recognized and converted into a control signal according to the operation 
content, and the obtained control signal is transferred to the virtual image 
generation unit 105. 

In a step S200, the position and pose information indicating the position 
and pose at the user's viewpoint is measured by the user viewpoint position and 
pose measurement unit 102, and the obtained information is transferred to the 
virtual image generation unit 105. 

In a step S300, the image of the real world viewed from the user's 
viewpoint is captured by the user viewpoint image input unit 106, and the captured 
image is then transferred to the image display unit 107. Here, in the case where the 
image display unit 107 is equipped with the optical see-through HMD as the 
display, the user can directly observe the real world, whereby the processing in the 
step S200 is unnecessary. 

In a step S400, communication data is received by the virtual image 
generation unit 105 through the transmission channel 200. For example, the 
communication data includes identification number information of each user using 
the information presentation apparatus 100, name information capable of 
discriminating each user, position and pose information of each user's viewpoint, 
operation information of each user, the annotation data and the like. 
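
The items of communication data listed above could be carried in a message such as the one sketched below. This is only an illustration under stated assumptions: the specification does not define a wire format, so the JSON encoding, the class name and every field name here are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class UserUpdate:
    """Hypothetical message sent by one apparatus to the others."""
    user_id: int                      # identification number information
    name: str                         # name information discriminating the user
    viewpoint_position: List[float]   # position of the user's viewpoint
    viewpoint_pose: List[float]       # pose, here a viewing direction (assumption)
    operation: str = ""               # operation information of the user
    annotation_ids: List[str] = field(default_factory=list)  # annotation data references

def encode(update: UserUpdate) -> bytes:
    """Serialize a message for the transmission channel (JSON assumed)."""
    return json.dumps(asdict(update)).encode("utf-8")

def decode(payload: bytes) -> UserUpdate:
    """Rebuild a message received from the transmission channel."""
    return UserUpdate(**json.loads(payload.decode("utf-8")))

msg = UserUpdate(1, "user A", [0.0, 1.6, 0.0], [0.0, 0.0, 1.0], "select", ["building-1"])
restored = decode(encode(msg))
```

Each apparatus would decode such messages in the step S400 and keep the latest viewpoint of every other user for use in the step S500.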

In a step S500, the annotation to be presented to the user is determined by 
the virtual image generation unit 105 on the basis of the user's operation 
information obtained in the step S100, the position and pose information at the 
user's viewpoint obtained in the step S200, and the information concerning other 
user obtained in the step S400. 

In the step S500, when a target in the real world or the virtual world that 
one user pays attention is notified to other users so that the other users pay 
attention to it, the plural users resultingly share the information in the MR space, 
whereby it is very useful for the plural users to perform working in which 
conference, lecture, cooperation or the like is necessary. Hereinafter, a means to 
achieve such an effect will be explained. 
First, the data concerning the target that the user pays attention at present 
is retrieved and selected from the information of the objects in the real world and 
the virtual world stored in the annotation data storage unit 104. Incidentally, the 
target that the user pays attention may be automatically recognized and selected by 
the information presentation apparatus 100 or manually selected according to the 
user's operation on the user operation input unit 101. 
In the method of automatically recognizing the target that the user pays 
attention, it is thought to use the position and pose information indicating the 
position and pose at the user's viewpoint obtained in the step S200 and the internal 
parameters of the image pickup device held in the model data storage unit 103. 

Incidentally, in the step S500, all candidates of the targets existing inside 
the user's visual range are captured from the annotation data storage unit 104 on the 
basis of the internal parameters of the image pickup device and the position and 
pose information indicating the position and pose at the user's viewpoint. Then, in 
regard to the captured candidates, a Euclidean distance between a user's visual line 
and a point representative of the target is calculated, and the candidate for which 
the Euclidean distance is minimum can be considered as an attention target. 
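
The selection of the attention target described above can be sketched as follows. This is a minimal illustration, assuming that the visual line is given by a viewpoint position and a direction vector and that the candidates have already been limited to those inside the visual range. The function names, the candidate names and the coordinates are hypothetical.

```python
import math

def distance_to_visual_line(t, p, b):
    """Euclidean distance from the point b to the line through t with direction p."""
    diff = [bi - ti for bi, ti in zip(b, t)]
    # Parameter of the point on the line that is closest to b.
    m = sum(d * pi for d, pi in zip(diff, p)) / sum(pi * pi for pi in p)
    foot = [ti + m * pi for ti, pi in zip(t, p)]
    return math.dist(foot, b)

def pick_attention_target(t, p, candidates):
    """Return the name of the candidate whose representative point is nearest the line."""
    return min(candidates, key=lambda name: distance_to_visual_line(t, p, candidates[name]))

# Hypothetical viewpoint at the origin looking along +Z, with two candidates.
viewpoint = (0.0, 0.0, 0.0)
direction = (0.0, 0.0, 1.0)
candidates = {"valve": (0.2, 0.0, 3.0), "pump": (2.0, 0.5, 4.0)}
target = pick_attention_target(viewpoint, direction, candidates)
```

Here the "valve" lies 0.2 from the visual line and the "pump" about 2.06, so the "valve" is taken as the attention target.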

In case of judging whether or not one target is within the user's visual 
range, for example, it is thought to do so by the calculation from the position and 
pose information indicating the position and pose at the user's viewpoint obtained 
from the user viewpoint position and pose measurement unit 102 and the internal 
parameters of the image pickup device provided in the user viewpoint image input 
unit 106. That is, the target is projected on an image screen from the position and 
pose at the user's viewpoint by using the internal parameters of the image pickup 
device. Then, when the coordinates of the target projected on the image screen 
exist within a certain range defined by the size of the image, it is judged that the 
target in question is within the user's visual range. 

It is assumed that a matrix created from the internal parameters of the 
image pickup device is given as follows. 

\[
K = \begin{pmatrix}
\alpha_u & -\alpha_u \cot\theta & u_0 \\
0 & \alpha_v / \sin\theta & v_0 \\
0 & 0 & 1
\end{pmatrix}
\]

where each of the symbols $\alpha_u$ and $\alpha_v$ indicates a pixel size of the 
image pickup device, the symbol $\theta$ indicates an angle between the longitudinal 
and lateral axes of the image pickup element, and the symbols $u_0$ and $v_0$ 
indicate coordinates of the pixel center. Moreover, it is assumed that a matrix 
created from the position and pose at the user's viewpoint is $P = (R\ t)$, where 
the symbol $R$ indicates a rotation matrix of three rows and three columns 
representing the pose at the user's viewpoint, and the symbol $t$ indicates a 
three-dimensional vector of the position of the user's viewpoint. Besides, it is 
assumed that the three-dimensional coordinates of the target are given as 
$x = (X, Y, Z, 1)^T$ by using the expression of the homogeneous coordinates, and 
the coordinates of the point of the target projected on the image screen are given 
as $u = (u, v, w)^T$ by using the expression of the homogeneous coordinates. 

The coordinates $u$ of the point of the target projected on the image screen 
can be obtained by calculation of $u = K P^{-1} x$. Then, when it is assumed that 
the range of the image in the u-axis direction is $[u_{\min}, u_{\max}]$ and the 
range of the image in the v-axis direction is $[v_{\min}, v_{\max}]$, if 
$u_{\min} \le u \le u_{\max}$ and $v_{\min} \le v \le v_{\max}$ are satisfied, it 
can be known that the target in question is within the user's visual range. 
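
The visual-range judgment described above can be sketched as follows. This is 
only an illustrative sketch, not the implementation of the embodiment: the 
function names are hypothetical, the product with P^{-1} is interpreted here as 
transforming the world point into the camera frame by R^T (X - t), and the 
division of the homogeneous coordinates by w and the rejection of points behind 
the camera are added assumptions. 

```python
import math

def intrinsic_matrix(alpha_u, alpha_v, theta, u0, v0):
    """Build the matrix K from the internal parameters named in the text."""
    return [
        [alpha_u, -alpha_u / math.tan(theta), u0],
        [0.0, alpha_v / math.sin(theta), v0],
        [0.0, 0.0, 1.0],
    ]

def project(K, R, t, X):
    """Project the world point X to homogeneous image coordinates (u, v, w).

    Assumption: R and t give the viewpoint pose in the world frame, so the
    point in the camera frame is R^T (X - t).
    """
    d = [X[i] - t[i] for i in range(3)]
    cam = [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]  # R^T d
    return [sum(K[i][j] * cam[j] for j in range(3)) for i in range(3)]

def in_visual_range(K, R, t, X, u_range, v_range):
    """Return True when the projected point lies inside the image ranges."""
    u, v, w = project(K, R, t, X)
    if w <= 0:  # assumption: a point behind the viewpoint is never visible
        return False
    u, v = u / w, v / w  # assumption: normalize the homogeneous coordinates
    return u_range[0] <= u <= u_range[1] and v_range[0] <= v <= v_range[1]
```

For example, with an identity rotation, a viewpoint at the origin and an image 
of 640 by 480 pixels centered at (320, 240), a target straight ahead on the 
optical axis projects to the image center and is judged to be visible. 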

The distance between a straight line obtained from the position and pose 
at the user's viewpoint and the point representative of the target can be 
calculated by obtaining the vector which passes through the point representative 
of the target and crosses the user's visual line, and then calculating the 
minimum value of the length of the vector in question. 

The user's visual line is expressed as v = t + kp, where the symbol t 
indicates the three-dimensional vector of the position of the user's viewpoint, the 
symbol p indicates a three-dimensional vector of the pose at the user's viewpoint, 
and the symbol k is a real number other than "0." 

Moreover, the point representative of the target is expressed by a 
three-dimensional vector $b$. Then, when it is assumed that the point where the 
vector passing through the point $b$ and orthogonal to the visual line crosses 
the visual line is given as $t + mp$, the value $m$ which minimizes the distance 
between the point $t + mp$ and the point $b$ may be obtained. That is, the 
minimum value of $\| t + mp - b \|$ is the distance between the visual line and 
the point representative of the target. 

The minimum is attained at $m = p \cdot (b - t) / \| p \|^2$, so that when this 
distance is calculated, $\| t - b + (p \cdot (b - t) / \| p \|^2) \, p \|$ is 
obtained. 
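
The distance calculation above, together with the selection in the step S500 of 
the candidate whose distance is minimum, can be sketched as follows. This is a 
sketch under stated assumptions: the function names and the dictionary of 
candidates are hypothetical, the candidates are assumed to have already been 
limited to those inside the user's visual range, and the foot of the 
perpendicular is taken at m = p . (b - t) / ||p||^2. 

```python
import math

def distance_to_visual_line(t, p, b):
    """Distance between the visual line t + k*p and the point b."""
    pp = sum(c * c for c in p)
    if pp == 0:
        raise ValueError("the direction vector p must be non-zero")
    d = [b[i] - t[i] for i in range(3)]
    m = sum(p[i] * d[i] for i in range(3)) / pp   # parameter of the foot point
    foot = [t[i] + m * p[i] for i in range(3)]    # point t + m*p on the line
    return math.dist(foot, b)

def attention_target(t, p, candidates):
    """Return the identifier of the candidate closest to the visual line.

    candidates: hypothetical mapping from a target identification number to
    the three-dimensional point representative of that target.
    """
    return min(candidates,
               key=lambda key: distance_to_visual_line(t, p, candidates[key]))
```

Note that this sketch measures the distance to the whole straight line, as in 
the text, and does not distinguish targets in front of the viewpoint from those 
behind it; that distinction is left to the preceding visual-range judgment. 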

Incidentally, as a method of selecting the target to which the user pays 
attention by operating the input device of the user operation input unit 101, the 
watching user may operate an input device such as the mouse or the joystick 
while watching the synthesized image displayed on the image display unit 107. 
For example, the user moves the mouse cursor to the position where the attention 
target is being displayed, and then presses the button of the mouse at that 
position, thereby selecting the desired target. Then, when the cursor handled by 
the user reaches the position where the object stored in the annotation data 
storage unit 104 is being displayed, the annotation concerning the object is 
generated, whereby the user can confirm whether or not the data concerning the 
object is stored in the annotation data storage unit 104. 

An identification number of the target to which the user pays attention is 
transferred to the transmission channel 200 in a step S600. At the same time, a 
user identification number and the position and pose information are also 
transferred to the transmission channel 200. Moreover, in the step S400, the 
identification number information of the target to which another user is paying 
attention, the other user's identification number and the position and pose 
information are received from the transmission channel 200. 

In the virtual image generation unit 105 of the information presentation 
apparatus 100 which is used by one user (called a watched user hereinafter), when 
it is judged that the target to which another user (called a watching user 
hereinafter) pays attention is outside the visual range of the watched user, the 
annotation indicating the direction of the target is generated. This annotation 
includes symbols, characters, images and the like. To make it easy to recognize 
which watching user is paying attention to the target, it is possible to generate 
an annotation of which the attributes such as a color, a shape, a character type 
and the like are changed for each watching user, or an annotation which 
indicates a name capable of discriminating the watching user. Thus, when the 
watched user turns toward the direction indicated by the annotation, he or she 
can watch the target that the watching user is observing. 

When the target to which the watching user is paying attention is inside the 
visual range of the watched user, the annotation indicating the information of the 
target in question is generated. At that time, the attributes of the generated 
annotation such as the color, the shape, the character type and the like are made 
different from those of other annotations so as to make the generated annotation 
remarkable. 
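
The choice between the two kinds of annotation described above can be sketched 
as follows. The `Annotation` record, its field names and the function name are 
hypothetical and serve only to illustrate the branching; the embodiment does 
not prescribe any particular data layout. 

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    kind: str            # "direction" (arrow toward the target) or "info"
    target_id: int       # identification number of the attention target
    watching_user: str   # name capable of discriminating the watching user
    highlighted: bool    # True when the attributes differ from other annotations

def annotation_for_watched_user(target_id, watching_user, target_in_visual_range):
    """Pick the annotation presented to the watched user for one target."""
    if target_in_visual_range:
        # Target is visible: show its information with distinct attributes.
        return Annotation("info", target_id, watching_user, highlighted=True)
    # Target is outside the visual range: indicate its direction instead.
    return Annotation("direction", target_id, watching_user, highlighted=False)
```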

Moreover, when the watched user uses the input device of the user 
operation input unit 101, he or she can control the target of the generated 
annotation. For example, it is possible to select a specific watching user and 
then generate only the annotation concerning the target to which the selected 
watching user pays attention. Conversely, it is possible to generate the 
annotations concerning the targets to which all the watching users are paying 
attention. Such a selection may be made not only by the watched user's operation 
of the input device but also by an input made in advance, before the step S000. 
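
The selection described above can be illustrated by the following minimal 
sketch. The mapping from watching users to attention targets and the function 
name are hypothetical; passing no selected user corresponds to generating the 
annotations for all the watching users. 

```python
def targets_to_annotate(attention, selected_user=None):
    """Return the target identification numbers for which to generate annotations.

    attention: hypothetical mapping from a watching user's identification
    number to the identification number of that user's attention target.
    """
    if selected_user is None:
        # No specific watching user selected: cover all watching users.
        return sorted(set(attention.values()))
    if selected_user in attention:
        return [attention[selected_user]]
    return []
```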

Fig. 4 shows a situation in which, in a case where a user 1 who is the 
watching user observes a certain building and the building is outside the visual 
range of a user 2 who is the watched user, the arrow indicating the direction 
of the building and the annotation indicating the name of the user 1 are generated 
and displayed on the screen to be presented to the user 2. 

Fig. 5 shows a situation in which, in a case where the user 1 who is the 
watching user is paying attention to the certain building and the building is 
inside the visual range of the user 2 who is the watched user, the annotation 
(black background and white text) indicating the name of the building is 
generated and displayed on the screen to be presented to the user 2. In this 
situation, the attributes (black background and white text) of the generated 
annotation are made different from the attributes (white background and black text) 
of another annotation, so as to make the generated annotation remarkable. 

In the information presentation apparatus 100 which is used by the 
watching user, in a case of generating the annotation of the information concerning 
the attention target, the attributes (color, shape, character type, etc.) of the 
annotation to be generated are made different from those of other annotations so as 
to make the annotation currently being generated remarkable, that is, to make it 
easy to distinguish. Moreover, the annotation of the information indicating 
whether or not the attention target is being observed by the watched user is 
generated and is presented to the watching user. 

Fig. 6 shows a situation in which the user 1 who is the watching user is 
paying attention to the certain building, and the annotation (black background and 
white text) indicating the name of the building is generated and is made to have 
attributes different from those of other annotations so as to make the generated 
annotation remarkable. Moreover, Fig. 6 shows a situation in which the 
annotation of the information indicating whether or not the watched users are 
paying attention to the building is generated and displayed. 

Moreover, in the information presentation apparatus 100 of each user, in a 
case where another user exists inside the visual range of the user in the real world, 
the annotation indicating the name capable of discriminating that other user 
is generated. Conversely, in a case where another user does not exist inside 
the visual range of the user in the real world, the annotation including the arrow 
indicating the direction of that other user and the name capable of discriminating 
that user is generated. 

Fig. 7 shows a situation in which the annotation indicating the position of a 
user 4 existing inside the visual range of the user 1 is generated and displayed on 
the image screen of the user 1, and the annotation including the arrow indicating 
the direction of the users 2 and 3 existing outside the visual range of the user 1 and 
the names capable of discriminating these users is generated and displayed on the 
image screen of the user 1. 



In the step S600, the communication data is transferred from the virtual 
image generation unit 105 to the transmission channel 200. For example, the 
communication data includes the identification number information of each user 
using the information presentation apparatus 100, the name information capable of 
discriminating each user, the position and pose information of each user's 
viewpoint, the operation information of each user, the annotation data and the like. 
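
One possible shape of the communication data listed above is sketched below. 
The class name, the field names and the use of JSON as the encoding are all 
assumptions made for illustration; the text specifies only which kinds of 
information are included, not how they are laid out or transmitted on the 
transmission channel 200. 

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class CommunicationData:
    user_id: int                      # identification number of the user
    user_name: str                    # name capable of discriminating the user
    position: list                    # position of the user's viewpoint
    pose: list                        # pose at the user's viewpoint
    operation: dict = field(default_factory=dict)     # operation information
    annotations: list = field(default_factory=list)   # annotation data
    attention_target_id: Optional[int] = None         # target being watched

def encode(data):
    """Serialize the communication data (hypothetical wire format: JSON)."""
    return json.dumps(asdict(data))

def decode(text):
    """Rebuild the communication data received from the channel."""
    return CommunicationData(**json.loads(text))
```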

In a step S700, in accordance with the model data stored in the model data 
storage unit 103, the user's viewpoint is set based on the position and pose 
information at the user's viewpoint obtained in the step S200, and the virtual world 
which can be viewed from that viewpoint is drawn. Moreover, the annotation 
determined in the step S600 is superimposed and drawn on the image of the 
virtual world. 

In the step S700, the image of the real world viewed from the user's 
viewpoint position and obtained in the step S200 may be first drawn as the 
background, and the virtual world and the annotation may then be superimposed 
and drawn on the background. In that case, in the step S800, only a process of 
outputting the image obtained as the result of the drawing to the image display 
device is performed. 

In the step S800, the image of the real world viewed from the user's 
viewpoint position and obtained in the step S200 and the image of the virtual world 
generated in the step S700 are synthesized, and then the synthesized image is 
drawn and output to the image display device. Here, in the case where the image 
display device of the image display unit 107 is equipped with the optical 
see-through HMD, the image of the virtual world is drawn and output to the image 
display device. 

In step S900, it is judged whether or not to end the operation of the 
information presentation apparatus 100. When it is judged not to end the 
operation, the flow returns to the step S100, while when it is judged to end the 
operation, the process ends as a whole. 

According to the present embodiment, it is possible to notify another user 
of the target to which one user wishes to cause the other user to pay attention, 
it is possible for the user to know the position and the direction of the target 
in question, and it is further possible to know whether or not the target to which 
one user is paying attention at present is observed by the other user. Therefore, 
it is easy to perform conferences, lectures, collaborative work or the like in 
which it is necessary for the single mixed reality space to be shared with plural 
persons. 

(Other Embodiment) 

The object of the present invention can also be achieved even in a case 
where a storage medium (or a recording medium) storing therein program codes of 
software to realize the functions of the above embodiment is supplied to a system 
or an apparatus, and thus a computer (or CPU, MPU) in the system or the apparatus 
reads and executes the program codes stored in the storage medium. In this case, 
the program codes themselves read from the storage medium realize the functions 
of the above embodiment, whereby the storage medium storing these program 
codes constitutes the present invention. Moreover, it is needless to say that the 
present invention includes not only a case where the functions of the above 
embodiment are realized by executing the program codes read by the computer, but 
also a case where an operating system (OS) or the like running on the computer 
performs a part or all of the actual processes on the basis of instructions of the 
program codes and thus the functions of the above embodiment are realized by 
such processes. 

Moreover, it is needless to say that the present invention also includes a 
case where, after the program codes read from the storage medium are written into 
a function expansion card inserted in the computer or a memory in a function 
expansion unit connected to the computer, a CPU or the like provided in the 
function expansion card or the function expansion unit performs a part or all of the 
actual processes on the basis of the instructions of the program codes, and thus the 
functions of the above embodiments are realized by such processes. 
As many apparently widely different embodiments of the present 
invention can be made without departing from the spirit and scope thereof, it is to 
be understood that the present invention is not limited to the specific embodiments 
thereof except as defined in the appended claims. 
ABSTRACT OF THE DISCLOSURE 

An information presentation apparatus comprises an input unit, a 
measurement unit to measure a user's viewpoint position and pose, a model data 
storage unit to store virtual world model data, real world model data, and data 
necessary to generate a virtual world image, an annotation data storage unit to store 
data added to real and virtual worlds and displayed, a generation unit to generate an 
image of the virtual world by using information in the measurement unit, the model 
data storage unit and the annotation data storage unit, a user viewpoint image input 
unit to capture a real-world image viewed from the user's viewpoint, and an image 
display unit to display an image obtained by synthesizing the image from the 
generation unit and the image from the user viewpoint image input unit or the 
image from the user viewpoint image input unit, on a user's image display. 